Method for structuring a bitstream for binary multimedia descriptions and a method for parsing this bitstream

ABSTRACT

For structuring a bitstream for binary multimedia descriptions, binary identifiers (BIDs) are positioned on at least one regular positioning grid. Parsing is performed by checking these binary identifiers (BIDs) on the positions defined by the positioning grid.

STATE OF THE ART

[0001] Present solutions use a textual representation of the description structures for the description of audio-visual data content in multimedia environments [MPEG-7 overview (version 3.0), ISO/IEC JTC1/SC29/WG11 N3445 Geneva, May/June 2000, pages 1 to 53]. For this task a so-called description definition language (DDL) is used, which is derived from the Extensible Markup Language (XML). An MPEG-7 description consists of descriptors (D) or description schemes (DS), where the number of description elements can be variable. In the MPEG input document M6061 (from the Geneva meeting), a binary format for the MPEG-7 data has been proposed, which allows a more compact encoding of description structures and thus savings in storage capacity and/or transmission bandwidth.

ADVANTAGES OF THE INVENTION

[0002] The steps of claim 1 describe a special structure of the bitstream for binary multimedia descriptions that allows fast parsing of the bitstream with a binary parser of very low complexity. State of the Art solutions do not specify at which positions in a bitstream the binary identifiers (BIDs) representing opening and closing tags of a multimedia description are placed. A binary parser therefore has to check each possible position of a bitstream sequentially until it finds a respective BID. The present invention specifies a structure for the bitstream that defines a grid of positions in the bitstream where BIDs may be placed. A BID may not start at any other position. Depending on the bitstream, this structure speeds up the parsing process, since the number of positions a parser has to check is reduced significantly. The parser already knows where to look for the respective tags: for parsing, it has to check the binary identifiers (BIDs) only at the positions defined by the positioning grid. This also reduces the computational complexity required of the binary parser.

[0003] A further object of the invention is to allow a binary parser to skip over a complete sub-description without having to parse the complete corresponding part of the bitstream. According to the steps of claim 10, this objective is achieved by assigning the same unique number to each opening binary identifier and closing binary identifier of the same type.

[0004] For some applications consuming binary descriptions of multimedia data, a complete sub-description of the overall description can be skipped if the application is not interested in that specific part of the information. According to State of the Art solutions, skipping a sub-description still requires parsing the complete description, which consumes time and computation power. With the invention, a binary parser that wants to skip a complete sub-description just has to search for the closing BID with the same unique number as the opening BID of the D or DS it wants to skip. This increases the parsing speed when sub-descriptions are skipped and also reduces the computational complexity of the parser.

DRAWINGS

[0005] Embodiments of the invention are illustrated in the figures and explained in detail in the description that follows.

[0006] FIG. 1 shows a structure of a bitstream with identical grid positions for opening and closing BIDs;

[0007] FIG. 2 shows a structure of a bitstream with separate grid positions for opening and closing BIDs.

DETAILED DESCRIPTION OF THE INVENTION

[0008] Before discussing the details of the invention, some definitions used in MPEG-7 and in the context of the remainder are presented:

[0009] Data: Data is audio-visual information that will be described using MPEG-7, regardless of storage, coding, display, transmission, medium, or technology.

[0010] Feature: A Feature is a distinctive characteristic of the data which signifies something to somebody.

[0011] Descriptor (D): A Descriptor is a representation of a Feature. A Descriptor defines the syntax and the semantics of the Feature representation.

[0012] Description Scheme (DS): A Description Scheme specifies the structure and semantics of the relationships between its components, which may be both Descriptors (Ds) and Description Schemes (DSs).

[0013] Description Definition Language (DDL): The Description Definition Language is a language that allows the creation of new Description Schemes and, possibly, Descriptors. It also allows the extension and modification of existing Description Schemes.

[0014] D/DS schema: The definition of a D/DS using the DDL, which is based on the XML-schema language. Here, the components of a specific D/DS, which may themselves be other Ds/DSs, and their relationships are defined.

[0015] D/DS instance: The instantiation of a certain D/DS, i.e. the description of actual data according to the elements defined in the D/DS schema.

[0016] Coded Description: A Coded Description is a Description that has been encoded to fulfil relevant requirements such as compression efficiency, error resilience, random access, etc.

[0017] Static DS: A DS that has been specified from the beginning and that is contained in a known dictionary of Ds and DSs.

[0018] Dynamic DS: A DS that is dynamically defined, using available static Ds and DSs.

[0019] In principle there are two ways to represent a D/DS instance, either as text using the XML language, or in binary form. In [M6061] a possible binary form for the descriptions is described. It consists mainly of binary identifiers (BIDs), which are unique for each possible D or DS, and which can be hierarchically structured in order to improve the parsing on bitstream level. An example of a simple DS in textual form is given below:

  <VideoSegment id=“VS1”>
    <MediaTime timeunit=“PT1N30F”>
      <MediaIncrTimePoint>0</MediaIncrTimePoint>
      <MediaIncrDuration>106</MediaIncrDuration>
    </MediaTime>
    <GoFGoPHistogramD HistogramTypeInfo=“Average”>
      <ColorHistogramD>
        <ColorSpaceD Space=“HSV”/>
        <LinearColorQuantizationD Quantization=“linear”>
          <bin_number>4</bin_number>
          <bin_number>4</bin_number>
          <bin_number>4</bin_number>
        </LinearColorQuantizationD>
        <HistogramD HistogramNormFactor=“1” NumberHistogramBins=“4”>
          <HistogramValue>444</HistogramValue>
          <HistogramValue>34</HistogramValue>
          <HistogramValue>58</HistogramValue>
          <HistogramValue>564</HistogramValue>
        </HistogramD>
      </ColorHistogramD>
    </GoFGoPHistogramD>
    <SegmentDecomposition Gap=“true” Overlap=“false” DecompositionType=“spatio-temporal”>
      <MovingRegion id=“MR2”>
        <MediaTime timeunit=“PT1N30F”>
          <MediaIncrTimePoint>53</MediaIncrTimePoint>
          <MediaIncrDuration>23</MediaIncrDuration>
        </MediaTime>
        <ParametricobjectMotion Xorigin=“0.000000” Yorigin=“0.000000” ModelType=“2”>
          <MotionParameters>0.000000</MotionParameters>
          <MotionParameters>0.000000</MotionParameters>
          <MotionParameters>0.000000</MotionParameters>
          <MotionParameters>0.000000</MotionParameters>
          <MotionParameters>0.000000</MotionParameters>
          <MotionParameters>0.000000</MotionParameters>
        </ParametricobjectMotion>
      </MovingRegion>
    </SegmentDecomposition>
  </VideoSegment>

[0020] The elements between the brackets “<. . . >” are referred to as XML-tags. In general, to each “opening tag” there corresponds a “closing tag”, which has the same name but with a leading “/”. As an example, the closing tag for “<MediaTime>” would be “</MediaTime>”. The meaning of the tags and thus of the Ds or DSs described by them is defined by the D or DS schema using the DDL. In the binary form described in [M6061], a unique binary identifier (BID) is assigned to each such tag of a pre-defined set of tags according to a set of specified Ds and DSs. In order to make parsing on binary level more robust, the BIDs can comprise a leading sequence of bits which is unique in the bitstream, i.e. cannot occur at other places than in BIDs. Using the BIDs so specified, the text form can be mapped into binary form by replacing each opening and closing tag by the respective BID, using e.g. a leading “0” or “1” to mark whether the BID represents an opening or a closing tag. The other values, i.e. the actual data like the “0.0000” for the motion parameters or the numbers in the histogram values, can be represented by the usual integer, float or ASCII text representation. To date, the BIDs representing the opening and closing tags can be placed anywhere in the bitstream, depending on the size of the actual data in between the tags.
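
The tag-to-BID mapping described above can be illustrated with a minimal Python sketch. The 7-bit codes, the table name BID_CODES and the function name tag_to_bid are hypothetical and chosen for illustration only; the actual codes would come from the dictionary of specified Ds and DSs. The sketch also assumes that a leading “0” marks an opening BID and a leading “1” a closing BID, and it ignores tag attributes.

```python
# Hypothetical 7-bit codes; real values would come from the D/DS dictionary.
BID_CODES = {
    "VideoSegment": "0000001",
    "MediaTime": "0000010",
}

OPEN_FLAG = "0"   # assumption: leading "0" marks an opening BID
CLOSE_FLAG = "1"  # assumption: leading "1" marks a closing BID


def tag_to_bid(tag: str) -> str:
    """Map a textual tag such as '<MediaTime>' or '</MediaTime>' to a bit string."""
    name = tag.strip("<>")
    closing = name.startswith("/")
    if closing:
        name = name[1:]
    # Flag bit first, then the code of the D or DS the tag belongs to.
    return (CLOSE_FLAG if closing else OPEN_FLAG) + BID_CODES[name]
```

Under these assumptions, the opening and closing BID of one D or DS differ only in their first bit.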

[0021] The method of the invention discloses a structure for a bitstream representing descriptions of multimedia data. This structure defines a regular grid of positions, at which the opening and closing tags corresponding to the XML-tags of Ds and DSs may be placed. In the present invention, three aspects can be distinguished:

[0022] Specification of a positioning grid for opening and closing BIDs in a bitstream

[0023] Specification of different positioning grids for opening and for closing BIDs

[0024] Unique numbering of opening and corresponding closing BIDs

[0025] Each of the aspects and the proposed solution in the light of the present invention will be described in the following.

[0026] A binary identifier BID for each specified descriptor D and description scheme DS is used to replace the originally textual (XML) opening and closing tag of the D or DS. In this respect, the BIDs are referred to as opening and closing BIDs. The opening and closing BID of a specific D or DS are largely identical and differ only in that e.g. a leading/trailing “0” or “1” denotes whether it is an opening or a closing BID.

[0027] By means of a first aspect of the present invention, a special structure for the bitstream is defined, in which opening or closing BIDs may only start at certain positions. Such a structure and the corresponding grid are shown in FIG. 1. Here, the first BID may start at bit number M (M>0), and from there on BIDs may start at each following Nth bit (with N>1) until the end of the bitstream. For example, if M=1 and N=8, BIDs would always start at the beginning of a new byte in the bitstream. The parameters M and N may be fixed in a respective specification, or they may alternatively be transmitted at the very beginning of the bitstream in its so-called header 1. With the latter approach, it is possible to adapt the parameters to specific application requirements.
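
The grid of permitted start positions can be enumerated with a short Python sketch. The function name grid_positions and the parameter stream_bits (the bitstream length in bits) are illustrative assumptions; bit numbers are counted from 1, as in the M=1, N=8 example above.

```python
def grid_positions(m: int, n: int, stream_bits: int) -> list[int]:
    """Return the 1-based bit numbers at which a BID may start."""
    if m < 1 or n < 2:
        raise ValueError("expected M > 0 and N > 1")
    # First position is M, then every Nth bit until the end of the bitstream.
    return list(range(m, stream_bits + 1, n))
```

For a 32-bit stream with M=1 and N=8 this yields the bit numbers 1, 9, 17 and 25, so a parser under these assumptions would check 4 positions instead of 32.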

[0028] If only one positioning grid is defined for both opening and closing BIDs, both kinds of BIDs may start at those positions. The opening and closing BIDs must therefore be marked, e.g. by a leading or trailing “0” and “1”, respectively, so that the parser can distinguish them.
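
Parsing on a single grid can be sketched as follows in Python. The code table BID_NAMES, the 8-bit BID layout (one flag bit followed by a 7-bit code) and the function name find_bids are hypothetical. The sketch further assumes that data between the BIDs never reproduces a BID pattern at a grid position, for instance because BIDs carry a leading bit sequence that is unique in the bitstream.

```python
# Hypothetical 7-bit codes mapped back to the names of Ds and DSs.
BID_NAMES = {"0000001": "VideoSegment", "0000010": "MediaTime"}
BID_BITS = 8  # assumption: 1 flag bit + 7 code bits


def find_bids(bits: str, m: int = 1, n: int = 8):
    """Check only the grid positions M, M+N, ... (1-based) for BIDs."""
    found = []
    for pos in range(m, len(bits) - BID_BITS + 2, n):
        word = bits[pos - 1 : pos - 1 + BID_BITS]
        name = BID_NAMES.get(word[1:])
        if name is not None:
            # Assumption: leading "0" marks opening, leading "1" closing.
            kind = "open" if word[0] == "0" else "close"
            found.append((pos, kind, name))
    return found
```

A stream consisting of an opening MediaTime BID, one byte of data and a closing MediaTime BID would be reported as an opening BID at bit number 1 and a closing BID at bit number 17.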

[0029] By means of a second aspect of the present invention, different grids for opening BIDs and for closing BIDs are specified. The grids are defined in such a way that the occurrence of an opening BID at a closing grid position is not possible, and vice versa. Thus, the same BID can be used for an opening and a closing tag, without having to mark it as such with a leading or trailing bit. A corresponding structure and the two respective grids are shown in FIG. 2. Here, the first opening BID may start at bit number M (M>0), and from then on at each following Nth bit (N>1), as is already the case in FIG. 1. However, no closing tag may start at those positions. The first closing tag may start at bit number K (K>M), and from then on at each following Lth bit (L>1). The parameters M, K, N and L shall be chosen such that the grids do not interfere with each other. All parameters may again be fixed in a respective specification, or they may alternatively be transmitted at the very beginning of the bitstream in its header 1.
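
One possible way to check a choice of M, N, K and L is sketched below in Python. It interprets "interfere" as the two grids sharing at least one start position, which is an assumption; the function names are illustrative. For unbounded grids, a bit number of the form M + i·N equals one of the form K + j·L exactly when gcd(N, L) divides K − M, so the test reduces to a single modulo operation. The sketch looks at start positions only and does not consider the length of the BIDs.

```python
from math import gcd


def grids_interfere(m: int, n: int, k: int, l: int) -> bool:
    """True if some bit number lies on both unbounded grids M+i*N and K+j*L."""
    return (k - m) % gcd(n, l) == 0


def shared_positions(m: int, n: int, k: int, l: int, stream_bits: int) -> list[int]:
    """Brute-force cross-check for a bitstream of finite length."""
    opening = set(range(m, stream_bits + 1, n))
    closing = set(range(k, stream_bits + 1, l))
    return sorted(opening & closing)
```

With M=1, N=8, K=5 and L=8 the grids stay apart, whereas M=1, N=8, K=5 and L=4 would let opening and closing positions coincide at bit number 9.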

[0030] Besides the aspects of just parsing a bitstream representing multimedia descriptions, it may be very helpful for a parser if it could skip complete sub-descriptions of an overall description, without having to parse the complete corresponding part of the bitstream. This could be the case, for example, if the part to skip contains information in which the application that consumes the description is not interested, while it may very well be interested in information beyond that sub-description. In the current textual form, the complete sub-description has to be parsed. By means of the third aspect of the present invention, however, it is possible to skip a whole sub-description without parsing the corresponding bitstream completely. To this end, a unique number is assigned to each opening BID of the same type, i.e. corresponding to the same D or DS. The same unique number is assigned to the corresponding closing tag of the respective sub-description. The unique numbers are added to the opening and closing BID as fixed or variable length codes, and written into the bitstream. When a parser now wants to skip a complete sub-description, it just has to search for the closing BID with the corresponding unique number, instead of parsing the whole corresponding part of the bitstream.
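
Skipping a sub-description by its unique number can be sketched as follows in Python. The layout is an assumption made for illustration: a BID consists of one flag bit (“1” for closing), a 7-bit code and an 8-bit fixed-length unique number, and BIDs lie on a single grid with N=8. The function names are hypothetical, and the sketch assumes that the data in between never reproduces the closing pattern at a grid position.

```python
def closing_pattern(code: str, number: int, number_bits: int = 8) -> str:
    """Closing BID = assumed flag '1' + BID code + fixed-length unique number."""
    return "1" + code + format(number, f"0{number_bits}b")


def skip_subdescription(bits: str, open_pos: int, code: str, number: int,
                        n: int = 8) -> int:
    """Return the 1-based bit number just past the matching closing BID.

    open_pos is the 1-based grid position of the opening BID to skip.
    """
    pattern = closing_pattern(code, number)
    # Compare one fixed pattern at each later grid position; nothing in
    # between is decoded.
    for pos in range(open_pos + n, len(bits) - len(pattern) + 2, n):
        if bits.startswith(pattern, pos - 1):
            return pos + len(pattern)
    raise ValueError("matching closing BID not found")
```

The parser in this sketch performs only fixed-length comparisons at grid positions, which is the saving the third aspect aims at.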

1. Method for structuring a bitstream for binary multimedia descriptions, where binary identifiers (BIDs) representing opening and closing tags of descriptors (D) and/or description schemes (DS) are used, characterised in that said binary identifiers (BIDs) are positioned on at least one regular positioning grid within the bitstream.

2. Method for parsing a bitstream structured according to claim 1, characterised in that parsing is carried out by checking said binary identifiers (BIDs) on the positions defined by said positioning grid.

3. Method according to one of claims 1 or 2, wherein only one regular positioning grid is provided for both opening and closing binary identifiers (BIDs) and wherein opening and closing binary identifiers (BIDs) are marked in order to distinguish them when parsing.

4. Method according to one of claims 1 or 2, wherein different positioning grids are provided for the opening and closing binary identifiers (BIDs) and wherein these different positioning grids are structured in such a way that the different grids do not interfere with each other.

5. Method according to claim 3, wherein the first binary identifier (BID) starts at bit number M (M>0) in the bitstream and the following binary identifiers (BIDs) at each following Nth bit (N>1).

6. Method according to claim 4, wherein the first opening binary identifier (BID) starts at a bit number M in the bitstream and the following opening binary identifiers at each following Nth bit (N>1) and wherein the first closing binary identifier (BID) starts at bit number K (K>M) and the following closing binary identifiers at each following Lth bit (L>1).

7. Method according to claim 6, wherein the parameters M, K, N and L are chosen such that the grids do not interfere with each other.

8. Method according to one of claims 5, 6 or 7, wherein the parameters M and N or M, N, K and L respectively are predetermined fixed parameters.

9. Method according to one of claims 5, 6 or 7, wherein the parameters M and N or M, N, K and L respectively are adaptively selectable and transmitted at the very beginning of the bitstream, in particular in its header (1).

10. Method according to one of claims 1 to 9, wherein a unique number is assigned to each opening binary identifier (BID) of the same type, i.e. corresponding to the same descriptor (D) or description scheme (DS), and the same unique number is assigned to the corresponding closing binary identifier (BID).

11. Method according to claim 10, wherein the unique numbers are added to the opening and closing binary identifiers (BIDs) as fixed or variable length codes, and written into the bitstream.

12. Method according to one of claims 10 or 11, wherein a parser that wants to skip a complete sub-description just has to search for corresponding unique numbers for opening and closing binary identifiers (BIDs) in the bitstream.